 topological data analysis


ToDD: Topological Compound Fingerprinting in Computer-Aided Drug Discovery

Neural Information Processing Systems

In computer-aided drug discovery (CADD), virtual screening (VS) is used for identifying the drug candidates that are most likely to bind to a molecular target in a large library of compounds.





From Betti Numbers to Persistence Diagrams: A Hybrid Quantum Algorithm for Topological Data Analysis

Liu, Dong

arXiv.org Artificial Intelligence

Persistence diagrams serve as a core tool in topological data analysis, playing a crucial role in pathological monitoring, drug discovery, and materials design. However, existing quantum topological algorithms, such as the LGZ algorithm, can only efficiently compute summary statistics like Betti numbers, failing to provide persistence diagram information that tracks the lifecycle of individual topological features, severely limiting their practical value. This paper proposes a novel quantum-classical hybrid algorithm that achieves, for the first time, the leap from "quantum computation of Betti numbers" to "quantum acquisition of practical persistence diagrams." The algorithm leverages the LGZ quantum algorithm as an efficient feature extractor, mining the harmonic form eigenvectors of the combinatorial Laplacian as well as Betti numbers, constructing specialized topological kernel functions to train a quantum support vector machine (QSVM), and learning the mapping from quantum topological features to persistence diagrams. The core contributions of this algorithm are: (1) elevating quantum topological computation from statistical summaries to pattern recognition, greatly expanding its application value; (2) obtaining more practical topological information in the form of persistence diagrams for real-world applications while maintaining the exponential speedup advantage of quantum computation; (3) proposing a novel hybrid paradigm of "classical precision guiding quantum efficiency." This method provides a feasible pathway for the practical implementation of quantum topological data analysis.
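As a purely classical illustration of the quantities mentioned in this abstract, the sketch below computes Betti numbers as the kernel dimensions of combinatorial Laplacians for a tiny simplicial complex. It is not the LGZ algorithm or the hybrid method of the paper; the function name `betti_numbers` and the example complexes are assumptions chosen for illustration.

```python
import numpy as np

def betti_numbers(d1, d2, tol=1e-9):
    """Return (b0, b1) for a complex given its boundary matrices.

    d1 maps edges to vertices, d2 maps triangles to edges.
    The k-th Betti number equals the nullity of the k-th
    combinatorial Laplacian (its harmonic forms span the kernel).
    """
    laplacian_0 = d1 @ d1.T
    laplacian_1 = d1.T @ d1 + d2 @ d2.T

    def nullity(matrix):
        eigenvalues = np.linalg.eigvalsh(matrix)
        return int(np.sum(np.abs(eigenvalues) < tol))

    return nullity(laplacian_0), nullity(laplacian_1)

# Vertices 0, 1, 2; edges ordered as (0,1), (0,2), (1,2).
d1 = np.array([[-1, -1,  0],
               [ 1,  0, -1],
               [ 0,  1,  1]], dtype=float)

# Hollow triangle: no 2-simplices, so d2 has zero columns.
d2_hollow = np.zeros((3, 0))

# Filled triangle: boundary of [0,1,2] is (1,2) - (0,2) + (0,1).
d2_filled = np.array([[1.0], [-1.0], [1.0]])

hollow = betti_numbers(d1, d2_hollow)  # one component, one loop
filled = betti_numbers(d1, d2_filled)  # one component, no loop
```

Filling in the triangle removes the loop, so the first Betti number drops from 1 to 0 while the number of connected components stays at 1.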


Deep Learning with Topological Signatures

Christoph Hofer, Roland Kwitt, Marc Niethammer, Andreas Uhl

Neural Information Processing Systems

Inferring topological and geometrical information from data can offer an alternative perspective on machine learning problems. Methods from topological data analysis, e.g., persistent homology, enable us to obtain such information, typically in the form


The Shape of Data: Topology Meets Analytics. A Practical Introduction to Topological Analytics and the Stability Index (TSI) in Business

Diamantis, Ioannis

arXiv.org Machine Learning

Modern business and economic datasets often exhibit nonlinear, multi-scale structures that traditional linear tools under-represent. Topological Data Analysis (TDA) offers a geometric lens for uncovering robust patterns, such as connected components, loops and voids, across scales. This paper provides an intuitive, figure-driven introduction to persistent homology and a practical, reproducible TDA pipeline for applied analysts. Through comparative case studies in consumer behavior, equity markets (SAX/eSAX vs. TDA) and foreign exchange dynamics, we demonstrate how topological features can reveal segmentation patterns and structural relationships beyond classical statistical methods. We discuss methodological choices regarding distance metrics, complex construction and interpretation, and we introduce the Topological Stability Index (TSI), a simple yet interpretable indicator of structural variability derived from persistence lifetimes. We conclude with practical guidelines for TDA implementation, visualization and communication in business and economic analytics.
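To make "persistence lifetimes" concrete, here is a minimal sketch that computes the lifetimes of connected components (degree-0 persistent homology) of a point cloud under a Vietoris-Rips filtration, assuming the filtration parameter is the pairwise Euclidean distance. It does not reproduce the paper's TSI, whose definition is not given in the abstract; the function name `h0_lifetimes` and the sample points are hypothetical.

```python
import itertools
import math

def h0_lifetimes(points):
    """Finite lifetimes of connected components in a Rips filtration.

    Every component is born at scale 0 and dies when it merges
    with another one, so each lifetime equals the length of the
    edge that causes the merge (a minimum spanning tree edge).
    One component never dies and is omitted here.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in itertools.combinations(range(n), 2)
    )
    lifetimes = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            lifetimes.append(length)
    return lifetimes

# Two tight pairs of points, far apart from each other.
cloud = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
lifetimes = h0_lifetimes(cloud)
```

For this cloud the two short lifetimes correspond to the points within each pair merging, and the single long lifetime reflects the gap between the two groups, which is the kind of multi-scale signal the abstract refers to.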


Topology of Currencies: Persistent Homology for FX Co-movements: A Comparative Clustering Study

de Jeneret, Pattravadee de Favereau, Diamantis, Ioannis

arXiv.org Machine Learning

This study investigates whether Topological Data Analysis (TDA) can provide additional insights beyond traditional statistical methods in clustering currency behaviours. We focus on the foreign exchange (FX) market, which is a complex system often exhibiting non-linear and high-dimensional dynamics that classical techniques may not fully capture. We compare clustering results based on TDA-derived features versus classical statistical features using monthly logarithmic returns of 13 major currency exchange rates (all against the euro). Two widely-used clustering algorithms, k-means and hierarchical clustering, are applied to both types of features, and cluster quality is evaluated via the Silhouette score and the Calinski-Harabasz index. Our findings show that TDA-based feature clustering produces more compact and well-separated clusters than clustering on traditional statistical features, particularly achieving substantially higher Calinski-Harabasz scores. However, all clustering approaches yield modest Silhouette scores, underscoring the inherent difficulty of grouping FX time series. The differing cluster compositions under TDA vs. classical features suggest that TDA captures structural patterns in currency co-movements that conventional methods might overlook. These results highlight TDA as a valuable complementary tool for analysing financial time series, with potential applications in risk management where understanding structural co-movements is crucial.
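The Calinski-Harabasz index used in this abstract is the ratio of between-cluster to within-cluster dispersion, each normalised by its degrees of freedom. The sketch below implements that standard formula with NumPy on synthetic data; the 13 points, the two feature dimensions, and both labelings are made up for illustration and are not the study's currencies or features.

```python
import numpy as np

def calinski_harabasz(features, labels):
    """Calinski-Harabasz index: higher means more compact,
    better separated clusters.

    CH = [B / (k - 1)] / [W / (n - k)], where B is the
    between-cluster and W the within-cluster sum of squares.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    n = features.shape[0]
    clusters = np.unique(labels)
    k = len(clusters)
    overall_mean = features.mean(axis=0)

    between = 0.0
    within = 0.0
    for cluster in clusters:
        members = features[labels == cluster]
        centroid = members.mean(axis=0)
        between += len(members) * np.sum((centroid - overall_mean) ** 2)
        within += np.sum((members - centroid) ** 2)
    return (between / (k - 1)) / (within / (n - k))

# Synthetic stand-in data: 13 points forming two tight blobs.
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=0.0, scale=0.1, size=(7, 2))
blob_b = rng.normal(loc=5.0, scale=0.1, size=(6, 2))
features = np.vstack([blob_a, blob_b])

matching_labels = np.array([0] * 7 + [1] * 6)   # follows the blobs
mixed_labels = np.array([0, 1] * 6 + [0])       # ignores the blobs

score_matching = calinski_harabasz(features, matching_labels)
score_mixed = calinski_harabasz(features, mixed_labels)
```

A labeling that follows the true blob structure scores far higher than one that mixes the blobs, which is the sense in which a higher index indicates more compact and better separated clusters.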


Persistent Homology of Topic Networks for the Prediction of Reader Curiosity

Hopp, Manuel D. S., Labatut, Vincent, Amalvy, Arthur, Dufour, Richard, Stone, Hannah, Jach, Hayley, Murayama, Kou

arXiv.org Artificial Intelligence

Reader curiosity, the drive to seek information, is crucial for textual engagement, yet remains relatively underexplored in NLP. Building on Loewenstein's Information Gap Theory, we introduce a framework that models reader curiosity by quantifying semantic information gaps within a text's semantic structure. Our approach leverages BERTopic-inspired topic modeling and persistent homology to analyze the evolving topology (connected components, cycles, voids) of a dynamic semantic network derived from text segments, treating these features as proxies for information gaps. To empirically evaluate this pipeline, we collect reader curiosity ratings from participants (n = 49) as they read S. Collins's "The Hunger Games" novel. We then use the topological features from our pipeline as independent variables to predict these ratings, and experimentally show that they significantly improve curiosity prediction compared to a baseline model (73% vs. 30% explained deviance), validating our approach. This pipeline offers a new computational method for analyzing text structure and its relation to reader engagement.